Gradient Estimates of Return

Authors

  • Christos Dimitrakakis
  • Samy Bengio
Abstract

The exploration-exploitation trade-off that arises when one considers simple point estimates of expected returns no longer appears when full distributions are considered. This work develops a simple gradient-based approach for maintaining such distributions and investigates methods for using them to direct exploration.
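
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian summary (mean and variance) of each action's return, updated by stochastic gradient steps on squared-error losses, and it directs exploration by sampling from those Gaussians on a multi-armed bandit. The names `ReturnEstimate`, `choose_action`, and `run_bandit`, the learning rate, and the bandit setting are all hypothetical choices made for illustration.

```python
import random


class ReturnEstimate:
    """Gaussian summary (mean, variance) of one action's return.

    Assumption for illustration: both moments are tracked with plain
    stochastic gradient steps; this is not taken from the paper.
    """

    def __init__(self, lr=0.1, init_var=1.0):
        self.mean = 0.0
        self.var = init_var
        self.lr = lr

    def update(self, ret):
        # Error against the current mean, computed before the mean moves.
        err = ret - self.mean
        # Gradient step on 0.5 * (ret - mean)^2 with respect to mean.
        self.mean += self.lr * err
        # Gradient step on 0.5 * (err^2 - var)^2 with respect to var.
        # For 0 < lr <= 1 this is a convex combination, so var stays >= 0.
        self.var += self.lr * (err * err - self.var)

    def sample(self, rng):
        # Draw one plausible return from the current Gaussian summary.
        return rng.gauss(self.mean, self.var ** 0.5)


def choose_action(estimates, rng):
    """Pick the action whose sampled return is largest."""
    samples = [e.sample(rng) for e in estimates]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, steps=2000, seed=0):
    """Run the sampling rule on a Gaussian bandit with unit reward noise."""
    rng = random.Random(seed)
    estimates = [ReturnEstimate() for _ in true_means]
    counts = [0] * len(true_means)
    for _ in range(steps):
        a = choose_action(estimates, rng)
        reward = rng.gauss(true_means[a], 1.0)
        estimates[a].update(reward)
        counts[a] += 1
    return estimates, counts
```

Calling `run_bandit([0.0, 1.0])` returns the per-action estimates and the number of times each action was chosen. Note that in this sketch the variance tracks the spread of observed returns rather than uncertainty about the mean, so exploration does not fade as data accumulates; a scheme that shrinks the variance with the number of observations would behave differently.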

Similar resources

Gradient-Based Estimates of Return Distributions

We present a general method for maintaining estimates of the distribution of parameters in arbitrary models. This is then applied to the estimation of probability distributions over actions in value-based reinforcement learning. While this approach is similar to other techniques that maintain a confidence measure for action-values, it nevertheless offers an insight into current techniques and h...

Simultaneous Estimation of Markup and Returns to Scale in Iranian Manufacturing Industries

This study simultaneously estimates the markup and returns to scale of 19 two-digit ISIC manufacturing industries in Iran over the period 1995-2007, using the Solow residual and structural approaches. Under the Solow residual approach, the neoclassical assumption of constant returns to scale is supported in 95% of manufacturing industries; however, in 84% of industrie...

Regional Flood Frequency Analysis Using Canonical Kriging in the Watersheds of Mazandaran Province

Regional analysis is a robust method for improving estimates of flood frequency; it has become one of the most active areas of hydrology, where new theories are constantly being tested. Applying geostatistical methods to regional flood analysis is an innovation in this field. The technique is based on interpolating hydrological variables in physiographical space instead of usin...

Modeling Volatility Spillovers in Iran Capital Market

This paper investigates the conditional correlations and volatility spillovers between the dollar exchange rate return, gold coin return and crude oil return to stock index return. Monthly returns in the 144 observations (2005 - 2017) are analyzed by constant conditional correlation, dynamic conditional correlation, VARMA-GARCH and VARMA-AGARCH models. So this paper presents interdependences in...

Bayesian Policy Gradient and Actor-Critic Algorithms

Policy gradient methods are reinforcement learning algorithms that adapt a parameterized policy by following a performance gradient estimate. Many conventional policy gradient methods use Monte-Carlo techniques to estimate this gradient. The policy is improved by adjusting the parameters in the direction of the gradient estimate. Since Monte-Carlo methods tend to have high variance, a large num...

Publication date: 2005